Architectural Alternatives for Recurrent Networks
Author
Abstract
This paper describes a class of recurrent neural networks related to Elman networks. The networks used herein differ from standard Elman networks in that they may have more than one state vector. Such networks have an explicit representation of the hidden unit activations from several steps back. In principle, a single-state-vector network is capable of learning any sequential task that a multi-state-vector network can learn. This paper describes experiments which show that, in practice, and for the learning task used, a multi-state-vector network can learn a task faster and better than a single-state-vector network. The task used involved learning the graphotactic structure of a sample of about 400 English words.

[Figure 1: Architecture of Elman's recurrent network; ω signifies total interconnection with trainable weights; 1 signifies that the activations at the destination are a copy of the activations at the source in the previous processing cycle.]

… of inputs which includes no instances of type t_k, say, then the partially-trained network might map such an input to a random type. In particular, for a binary classification network (i.e. T = {+, –}), a standard backpropagation network must be trained on inputs of type + and of type –. There are two reasons why this model of learning, training on examples and non-examples, is inappropriate to learning syntax, as in Elman's task, or graphotactics, …
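The multi-state-vector idea can be made concrete with a short sketch. This is one plausible formulation, not the paper's exact one; the names (MultiStateElman, W_ctx, k_states) are illustrative assumptions. The hidden layer's update sees copies of its own activations from the previous k processing cycles, and the copies shift back one cycle per step, as the copy links in Figure 1 suggest.

```python
import numpy as np

# Minimal sketch of a multi-state-vector Elman-style network.
# Names and formulation are assumptions for illustration only.
class MultiStateElman:
    def __init__(self, n_in, n_hidden, n_out, k_states=3, seed=0):
        rng = np.random.default_rng(seed)
        self.k = k_states
        # Trainable, totally interconnected links (the omega links in Figure 1).
        self.W_in = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.W_ctx = rng.normal(0.0, 0.1, (k_states, n_hidden, n_hidden))
        self.W_out = rng.normal(0.0, 0.1, (n_out, n_hidden))
        # Context: copies of the hidden activations from the last k cycles
        # (the fixed one-to-one copy links, marked "1" in Figure 1).
        self.context = np.zeros((k_states, n_hidden))

    def step(self, x):
        # The new hidden state sees the current input and all k stored
        # copies of earlier hidden states.
        h = np.tanh(self.W_in @ x
                    + sum(self.W_ctx[i] @ self.context[i] for i in range(self.k)))
        # Shift the stored copies: each moves one processing cycle further back.
        self.context = np.vstack([h, self.context[:-1]])
        return self.W_out @ h
```

For the graphotactics task, x would be a one-hot letter vector and the output a two-way {+, –} word/non-word decision; with k_states = 1 this reduces to a standard single-state-vector Elman network. Training is omitted from the sketch.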
Similar Resources
Learning Performance of Networks like Elman's Simple Recurrent Networks but having Multiple State Vectors
Target Papers:
• William H. Wilson, A comparison of architectural alternatives for recurrent networks, Proceedings of the Fourth Australian Conference on Neural Networks, ACNN'93, Melbourne, 13 February 1993, 189-192. ftp://ftp.cse.unsw.edu.au/pub/users/billw/wilson.recurrent.ps.Z
• William H. Wilson, Stability of learning in classes of recurrent and feedforward networks, in Proceedings of the ...
Architectural Bias in Recurrent Neural Networks - Fractal Analysis
We have recently shown that when initialized with “small” weights, recurrent neural networks (RNNs) with standard sigmoid-type activation functions are inherently biased towards Markov models, i.e. even prior to any training, RNN dynamics can be readily used to extract finite memory machines (Hammer & Tiňo, 2002; Tiňo, Čerňanský & Beňušková, 2002; Tiňo, Čerňanský & Beňušková, 2002a). Following ...
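As a rough illustration of that extraction idea (an assumed minimal setup, not the cited papers' exact procedure): drive an untrained, small-weight RNN with a symbol stream, cluster the visited hidden states, and read a finite memory machine's transition table off the clustered trajectory.

```python
import numpy as np
from sklearn.cluster import KMeans

# Illustrative sketch only: extract a finite memory machine from an
# UNTRAINED recurrent network initialized with "small" weights.
rng = np.random.default_rng(1)
n_sym, n_hidden = 2, 8
W_in = rng.normal(0.0, 0.1, (n_hidden, n_sym))     # small weights are the
W_rec = rng.normal(0.0, 0.1, (n_hidden, n_hidden)) # source of the Markovian bias

seq = rng.integers(0, n_sym, 500)  # random binary input sequence
h = np.zeros(n_hidden)
states = []
for s in seq:
    x = np.eye(n_sym)[s]
    h = np.tanh(W_in @ x + W_rec @ h)  # no training at any point
    states.append(h)
states = np.array(states)

# Quantize the state space; each cluster becomes one machine state.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(states)

# Transition counts: (current machine state, next input symbol) -> next state.
trans = np.zeros((4, n_sym, 4), dtype=int)
for t in range(len(seq) - 1):
    trans[labels[t], seq[t + 1], labels[t + 1]] += 1
print(trans.argmax(axis=2))  # most frequent successor per (state, symbol)
```

With small weights, the state at time t is dominated by the last few input symbols, so the clusters line up with recent input histories; that is the Markovian bias the snippet describes.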
Combining Recurrent and Convolutional Neural Networks for Relation Classification
This paper investigates two different neural architectures for the task of relation classification: convolutional neural networks and recurrent neural networks. For both models, we demonstrate the effect of different architectural choices. We present a new context representation for convolutional neural networks for relation classification (extended middle context). Furthermore, we propose conn...
Approaches Based on Markovian Architectural Bias in Recurrent Neural Networks
Recent studies show that the state-space dynamics of a randomly initialized recurrent neural network (RNN) have interesting and potentially useful properties even without training. More precisely, when an RNN is initialized with small weights, recurrent unit activities reflect the history of inputs presented to the network according to a Markovian scheme. This property of RNNs is called Markovian architectura...
Publication date: 2007